18. Other Applications of GANs

So far, you've seen a lot of examples of how GANs can be used for image generation and transformation. GANs are a relatively new formulation, and there are some really exciting research directions that build on them. I didn't have time to cover them all in the videos, so I wanted to highlight a few of my favorite examples here and link to some resources that I've found helpful! This page is for anyone who is interested in learning more about GANs and curious about semi-supervised learning.

1. Semi-Supervised Learning

Semi-supervised models are used when you only have a few labeled data points. The motivation for this kind of model is that we increasingly have a lot of raw data, while the task of labeling it is tedious, time-consuming, and often prone to human error. Semi-supervised models give us a way to learn from a large set of data with only a few labels, and they perform surprisingly well even though the amount of labeled data is relatively tiny. Ian Goodfellow has put together a video on this topic, which you can see below.

Video: Semi-Supervised Learning (Ian Goodfellow)

Semi-Supervised Learning in PyTorch

There is a readable implementation of a semi-supervised GAN in this GitHub repository. If you'd like to implement a semi-supervised GAN yourself, I suggest reading through that code!
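
To make the idea concrete, here is a minimal sketch of the core trick behind semi-supervised GANs: the discriminator doubles as a classifier by outputting K + 1 classes, the K real classes plus one extra "fake" class. This is an illustrative sketch, not the code from the linked repository; the layer sizes, the 1×28×28 input shape, and the equal loss weighting are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10  # number of real classes, e.g. digits

class Discriminator(nn.Module):
    """A discriminator that doubles as a classifier: it outputs
    NUM_CLASSES + 1 logits, where the extra class means "fake"."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten(),
        )
        self.out = nn.Linear(128 * 7 * 7, NUM_CLASSES + 1)  # assumes 28x28 inputs

    def forward(self, x):
        return self.out(self.features(x))

def discriminator_loss(d, labeled_x, labels, unlabeled_x, fake_x):
    fake_class = NUM_CLASSES  # index of the extra "fake" class
    # Supervised term: the few labeled images get ordinary cross-entropy.
    loss_supervised = F.cross_entropy(d(labeled_x), labels)
    # Unsupervised term 1: real but unlabeled images should NOT be called fake.
    p_fake = F.softmax(d(unlabeled_x), dim=1)[:, fake_class]
    loss_unlabeled = -torch.log(1.0 - p_fake + 1e-8).mean()
    # Unsupervised term 2: generated images should land in the fake class.
    fake_targets = torch.full((fake_x.size(0),), fake_class,
                              dtype=torch.long, device=fake_x.device)
    loss_fake = F.cross_entropy(d(fake_x), fake_targets)
    return loss_supervised + loss_unlabeled + loss_fake
```

This is why the approach works with so few labels: the large pool of unlabeled images still trains the shared features through the real/fake game, while the handful of labels only has to steer the class boundaries.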

2. Domain Invariance

Consider this car classification example. From the abstract, the researchers (Timnit Gebru et al.) wanted to:

develop a computer vision pipeline to predict income, per capita carbon emission, crime rates and other city attributes from a single source of publicly available visual data. We first detect cars in 50 million images across 200 of the largest US cities and train a model to predict demographic attributes using the detected cars. To facilitate our work, we have collected the largest and most challenging fine-grained dataset reported to date consisting of over 2600 classes of cars comprised of images from Google Street View and other web sources, classified by car experts to account for even the most subtle of visual differences.

One interesting thing to note is that these researchers obtained some manually-labeled Street View data and data from other sources. I'll call these image sources domains. So, Street View is one domain and another source, say cars.com, is a separate domain.

Different image sources for the paper, [Fine-Grained Car Detection for Visual Census Estimation](https://arxiv.org/abs/1709.02480)

The researchers then had to find a way to combine what they learned from these multiple sources! They did this with an adversarial setup that does not include a generator at all, just two classifiers:

  • One classifier is learning to recognize car types
  • And another is learning to classify whether a car image came from Google Street View or cars.com, given the features extracted from that image

So, the first classifier’s job is to classify the car image correctly and to trick the second classifier, so that the second classifier cannot tell whether the extracted image features indicate an image from the Street View or cars.com domain!

The idea is: if the second classifier cannot tell which domain the features came from, then those features are shared between the two domains, and you’ve found features that are domain-invariant.

Domain invariance can be applied to a number of applications in which you want to find features that are invariant between two different domains. These can be image domains, domains based on different population demographics, and so on. This is also sometimes referred to as adversarial feature learning.
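
Here is a minimal sketch of one common way to implement this idea: the gradient reversal trick from domain-adversarial training (DANN). To be clear, this is an illustrative assumption, not the exact setup from the paper. A shared feature extractor feeds both a car classifier and a domain classifier, and the reversed gradient pushes the shared features toward being domain-invariant.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips the gradient sign on the
    backward pass, so the shared features learn to FOOL the domain head."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DomainInvariantNet(nn.Module):
    def __init__(self, num_car_classes, feat_dim=256):
        super().__init__()
        self.features = nn.Sequential(  # shared feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )
        self.car_head = nn.Linear(feat_dim, num_car_classes)  # classifier 1: car type
        self.domain_head = nn.Linear(feat_dim, 2)             # classifier 2: which domain?

    def forward(self, x, lam=1.0):
        feats = self.features(x)
        car_logits = self.car_head(feats)
        # The domain head sees the features through the gradient-reversal layer.
        domain_logits = self.domain_head(GradReverse.apply(feats, lam))
        return car_logits, domain_logits

# Hypothetical training step: `domain` is 0 for Street View, 1 for cars.com.
# car_logits, dom_logits = model(images)
# loss = F.cross_entropy(car_logits, car_labels) + F.cross_entropy(dom_logits, domain)
# loss.backward()  # the reversed gradient makes the features domain-invariant
```

If the domain head can't beat chance, the extracted features no longer carry domain information, which is exactly the domain-invariance property described above.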

3. Ethical and Artistic Applications: Further Reading

  • Ethical implications of GANs and when "fake" images can give us information about reality.
  • Do Androids Dream in Balenciaga? Note that the author briefly talks about generative models having artistic potential rather than ethical implications, but the two go hand in hand. The generator, in this case, will recreate what it sees on the fashion runway: typically thin, white bodies that do not represent the diversity of people in the world (or even the diversity of people who buy Balenciaga).